[V0 Deprecation] Remove code related to per-request logits processors#34400
Conversation
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
There was a problem hiding this comment.
Code Review
This pull request aims to remove per-request logits processors, which are deprecated in V1. The changes are spread across configuration, engine arguments, sampling parameters, and OpenAI protocol files to remove logits_processor_pattern and per-request logits_processors. The changes are generally correct and align with the goal. However, I've found that the removal is incomplete in a few places. Specifically, the logits_processors attribute is still present in SamplingParams, EngineArgs, and ModelConfig classes, which could lead to silent failures or confusion. I've left detailed comments on how to address this.
I am having trouble creating individual review comments. Click here to see my feedback.
vllm/engine/arg_utils.py (1329)
This change, along with others in this PR, effectively disables engine-level logits_processors. However, the logits_processors attribute still exists in EngineArgs and ModelConfig, and is accepted by LLM.__init__. To avoid confusion and silent failures, these should be removed as well if they are no longer intended to be used.
vllm/sampling_params.py (458-462)
While this check is being removed, the logits_processors attribute is still defined on the SamplingParams class (line 212), and as an argument to from_optional (line 280). To fully complete the deprecation and prevent users from thinking the parameter is still effective, the attribute should be removed from the class definition and method signatures entirely. Leaving it can lead to silent failures and confusion.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request effectively removes the deprecated per-request logits processors, which simplifies the API and codebase. The changes are clean and consistent across the modified files. However, I've identified a critical issue where model-level logits processors, configured via CLI arguments, are no longer being applied due to a missing parameter in the ModelConfig instantiation. This appears to be an unintended side effect of the refactoring. My review includes a specific comment and suggestion to fix this.
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request effectively removes the deprecated per-request logits processors feature. The changes are comprehensive, touching configuration, argument parsing, API protocols, serving logic, and tests. Key removals include SamplingParams.logits_processors, ModelConfig.logits_processor_pattern, and the corresponding fields in ChatCompletionRequest and CompletionRequest.
I appreciate the thoroughness of this refactoring. The removal of related tests and mock configurations is clean. A notable improvement is the relocation of the validation logic for model-level logits processors from the serving layer into SamplingParams.verify(), which centralizes the logic and correctly uses a local import to prevent circular dependencies.
Overall, the changes are well-implemented, improve code clarity by removing deprecated code, and I see no issues with the implementation.
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Purpose
Removed the following as they're not valid in V1:
ModelConfig.logits_processor_patternSamplingParams.logits_processorsChatCompletionRequest.logits_processorCompletionRequest.logits_processorcc @afeldman-nm
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.